Graph learning in robotics: a survey
Deep neural networks for graphs have emerged as a powerful tool for learning on complex non-Euclidean data, which is becoming increasingly common in a variety of applications. Yet, although their potential has been widely recognised in the machine learning community, graph learning remains largely unexplored for downstream tasks such as robotics applications. To fully unlock their potential, we therefore propose a review of graph neural architectures from a robotics perspective. The paper covers the fundamentals of graph-based models, including their architecture, training procedures, and applications. It also discusses recent advancements and challenges that arise in applied settings, related, for example, to the integration of perception, decision-making, and control. Finally, the paper provides an extensive review of various robotic applications that benefit from learning on graph structures, such as body and contact modelling, robotic manipulation, action recognition, fleet motion planning, and many more. This survey aims to provide readers with a thorough understanding of the capabilities and limitations of graph neural architectures in robotics, and to highlight potential avenues for future research.
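As a concrete illustration of the message-passing idea underlying the surveyed architectures, the sketch below implements one mean-aggregation GNN layer in plain NumPy. The layer structure, weights, and toy graph are illustrative assumptions, not taken from any specific surveyed model.

```python
import numpy as np

def gnn_layer(H, A, W):
    """One message-passing layer: aggregate neighbour features, then transform.

    H: (n, d) node features; A: (n, n) adjacency matrix; W: (d, d_out) weights.
    A minimal sketch of mean aggregation, common to many GNN variants.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)    # node degrees (incl. self-loop)
    H_agg = (A_hat @ H) / deg                 # mean over each neighbourhood
    return np.maximum(H_agg @ W, 0.0)         # linear transform + ReLU

# toy graph: three nodes in a path 0-1-2
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H = np.eye(3)                                  # one-hot node features
W = np.ones((3, 2))                            # fixed weights, for illustration only
out = gnn_layer(H, A, W)
print(out.shape)  # (3, 2)
```

Stacking such layers lets node representations incorporate information from increasingly distant neighbourhoods, which is what makes these models attractive for relational robotic data.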
Opportunities for graph learning in robotics
In the last few years, robotics has greatly benefited from the use of machine and deep learning to process the data streams captured by robots during their tasks. Yet, encoding data in grids (images) or vectors (time series) restricts processing to Euclidean data only. To extend the reach of deep learning to unstructured data, such as point clouds or functional relations, a rising yet under-explored approach lies in the use of graph neural networks (GNNs). With this manuscript, we intend to deliver a brief introduction to GNNs for robotics applications, together with a concise review of notable applications in the field, with the aim of fostering the use of this learning strategy in a wider context and highlighting potential future research directions.
FreeREA: Training-Free Evolution-based Architecture Search
In the last decade, most research in machine learning has contributed to the improvement of existing models, with the aim of increasing the performance of neural networks on a variety of tasks. However, such advancements often come at the cost of increased model memory and computational requirements. This represents a significant limitation for the deployability of research output in realistic settings, where the cost, the energy consumption, and the complexity of the framework play a crucial role. To address this issue, the designer should search for models that maximise performance while limiting their footprint. Typical approaches to this goal rely either on manual procedures, which cannot guarantee the optimality of the final design, or on Neural Architecture Search (NAS) algorithms that automate the process at the expense of extremely high computational time. This paper provides a solution for the fast identification of a neural network that maximises model accuracy while respecting the size and computational constraints typical of tiny devices. Our approach, named FreeREA, is a custom cell-based evolutionary NAS algorithm that exploits an optimised combination of training-free metrics to rank architectures during the search, thus avoiding model training altogether. Our experiments, carried out on the common benchmarks NAS-Bench-101 and NATS-Bench, demonstrate that i) FreeREA is the first method able to provide very accurate models within minutes of search time; ii) it outperforms state-of-the-art training-based and training-free techniques on all the datasets and benchmarks considered; and iii) it easily generalises to constrained scenarios, representing a competitive solution for fast Neural Architecture Search in generic constrained applications.
Comment: 16 pages, 4 figures
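The core idea above, ranking candidates with a training-free proxy inside an ageing-evolution loop, can be sketched as follows. The `proxy_score`, the (depth, width) architecture encoding, and the mutation operator are simplified placeholders, not FreeREA's actual metrics or search space.

```python
import random

def proxy_score(arch):
    """Hypothetical training-free proxy: reward capacity, penalise parameter count.
    A stand-in for metrics like synflow; not FreeREA's actual scoring."""
    depth, width = arch
    params = depth * width * width            # rough parameter count
    return depth * width - 0.001 * params

def mutate(arch):
    """Perturb either depth or width, keeping both within valid bounds."""
    depth, width = arch
    if random.random() < 0.5:
        depth = max(1, depth + random.choice([-1, 1]))
    else:
        width = max(8, width + random.choice([-8, 8]))
    return (depth, width)

def evolve(generations=200, pop_size=16, sample=4, seed=0):
    """Ageing evolution ranked only by the training-free proxy (no training)."""
    random.seed(seed)
    pop = [(random.randint(1, 8), random.choice([16, 32, 64]))
           for _ in range(pop_size)]
    for _ in range(generations):
        parents = random.sample(pop, sample)
        parent = max(parents, key=proxy_score)  # best of the sampled tournament
        pop.append(mutate(parent))              # child enters the population
        pop.pop(0)                              # oldest candidate ages out
    return max(pop, key=proxy_score)

best = evolve()
print(best)
```

Because the proxy needs only a forward-style computation rather than training, the whole search runs in seconds here; the real method replaces the toy proxy with a combination of training-free metrics evaluated on actual networks.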
Optimal Reconstruction of Human Motion From Scarce Multimodal Data
Wearable sensing has emerged as a promising solution for unobtrusive and ergonomic measurement of human motion. However, the reconstruction performance of these devices strongly depends on the quality and the number of sensors, which are typically limited by wearability and economic constraints. A promising approach to minimising the number of sensors is to exploit dimensionality reduction techniques that fuse prior information with insufficient sensing signals through minimum variance estimation. These methods have been used successfully for static hand pose reconstruction, but their translation to motion reconstruction has not yet been attempted. In this work, we propose the use of functional principal component analysis to decompose multimodal, time-varying motion profiles into linear combinations of basis functions. Functional decomposition enables the estimation of the a priori covariance matrix, and hence the fusion of scarce and noisy measured data with a priori information. We also consider the problem of identifying which elemental variables to measure as the most informative for a given class of tasks. We applied our method to two datasets of upper limb motion, D1 (joint trajectories) and D2 (joint trajectories + EMG data), considering an optimal set of measures (four joints out of seven for D1; three joints and eight EMGs out of seven and twelve, respectively, for D2). We found that our approach enables the reconstruction of upper limb motion with a median error of rad for D1 (relative median error 0.9%), and rad and mV for D2 (relative median errors 2.9% and 5.1%, respectively).
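The fusion step described above, combining an a priori covariance with a scarce set of noisy measurements via minimum variance estimation, can be sketched as plain Gaussian conditioning. The toy prior and the choice of measured joints are illustrative assumptions; in the paper the prior would come from a functional PCA of recorded motions.

```python
import numpy as np

def mve_reconstruct(mu, Sigma, idx, y, noise_var=1e-4):
    """Minimum variance estimate of a full pose x from y = x[idx] + noise.

    mu, Sigma: prior mean and covariance of the full pose (e.g. from PCA);
    idx: indices of the measured joints; y: the measurements.
    """
    n = len(mu)
    H = np.zeros((len(idx), n))
    H[np.arange(len(idx)), idx] = 1.0          # selection (measurement) matrix
    R = noise_var * np.eye(len(idx))           # measurement noise covariance
    S = H @ Sigma @ H.T + R                    # innovation covariance
    K = Sigma @ H.T @ np.linalg.inv(S)         # minimum-variance gain
    return mu + K @ (y - H @ mu)               # posterior mean estimate

# toy prior over a 4-DoF "pose" with correlated joints
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T + 0.1 * np.eye(4)
mu = np.zeros(4)
x_true = rng.multivariate_normal(mu, Sigma)
idx = np.array([0, 2])                         # measure only two of four joints
x_hat = mve_reconstruct(mu, Sigma, idx, x_true[idx])
print(x_hat.shape)  # (4,)
```

The unmeasured joints are filled in through the off-diagonal terms of the prior covariance, which is exactly why a good a priori model allows fewer sensors.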
Bringing Online Egocentric Action Recognition into the wild
To enable safe and effective human-robot cooperation, it is crucial to develop models for the identification of human activities. Egocentric vision seems to be a viable solution to this problem, and many works therefore provide deep learning solutions to infer human actions from first-person videos. However, although very promising, most of these do not consider the major challenges that come with realistic deployment, such as the portability of the model, the need for real-time inference, and robustness to novel domains (i.e., new spaces, users, tasks). With this paper, we set the boundaries that egocentric vision models should consider for realistic applications, defining a novel setting of egocentric action recognition in the wild, which encourages researchers to develop novel, application-aware solutions. We also present a new model-agnostic technique that enables the rapid repurposing of existing architectures in this new context, demonstrating the feasibility of deploying a model on a tiny device (a Jetson Nano) and performing the task directly on the edge with very low energy consumption (2.4 W on average at 50 fps).
Toward brain-heart computer interfaces: A study on the classification of upper limb movements using multisystem directional estimates
Objective. Brain-computer interfaces (BCIs) exploit computational features from brain signals to perform a given task. Despite recent neurophysiological and clinical findings indicating the crucial role of the functional interplay between brain and cardiovascular dynamics in locomotion, heartbeat information has yet to be included in common BCI systems. In this study, we exploit the multidimensional features of the directional and functional interplay between electroencephalographic and heartbeat spectra to classify upper limb movements into three classes. Approach. We gathered data from 26 healthy volunteers who performed 90 movements; the data were processed using a recently proposed framework for brain-heart interplay (BHI) assessment based on synthetic physiological data generation. The extracted BHI features were employed to classify, through a sequential forward selection scheme and a k-nearest neighbours algorithm, between the resting state and three classes of movements defined by the kind of interaction with objects. Main results. The results demonstrated that the proposed brain-heart computer interface (BHCI) system could automatically distinguish between rest and the movement classes with an average accuracy of 90%. Significance. Furthermore, this study provides neurophysiological insights indicating the crucial role, in upper limb neural control, of functional interplay originating at the cortical level and directed towards the heart. The inclusion of functional BHI insights might substantially improve neuroscientific knowledge about motor control, and this may lead to advanced BHCI system performance.
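The classification pipeline described in the Approach, sequential forward selection feeding a k-nearest-neighbours classifier, can be sketched as below. The synthetic features are stand-ins for the BHI features, and the leave-one-out scoring is an assumption, not necessarily the paper's validation scheme.

```python
import numpy as np

def knn_accuracy(X, y, feats, k=3):
    """Leave-one-out accuracy of a k-NN classifier on the selected features."""
    Xs = X[:, feats]
    correct = 0
    for i in range(len(y)):
        d = np.linalg.norm(Xs - Xs[i], axis=1)
        d[i] = np.inf                           # exclude the held-out sample
        nn = np.argsort(d)[:k]                  # k nearest neighbours
        correct += np.bincount(y[nn]).argmax() == y[i]
    return correct / len(y)

def forward_selection(X, y, n_feats):
    """Greedy SFS: at each step add the feature that most improves accuracy."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_feats:
        best = max(remaining, key=lambda f: knn_accuracy(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# toy data: feature 0 separates the two classes, features 1-2 are pure noise
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
X = rng.standard_normal((60, 3))
X[:, 0] += 3 * y                                # the informative feature
chosen = forward_selection(X, y, 1)
print(chosen)
```

Greedy forward selection keeps the feature vector small, which matters when, as here, each feature is an estimated spectral quantity with its own noise.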
Entropic Score metric: Decoupling Topology and Size in Training-free NAS
Neural network design is a complex and often daunting task, particularly in the resource-constrained scenarios typical of mobile-sized models. Neural Architecture Search is a promising approach to automating this process, but existing competitive methods require long training times and considerable computational resources to generate accurate models. To overcome these limits, this paper contributes: i) a novel training-free metric, named Entropic Score, which estimates model expressivity through the aggregated element-wise entropy of its activations; and ii) a cyclic search algorithm to separately yet synergistically search model size and topology. Entropic Score shows a remarkable ability to search for the topology of the network, and a proper combination with LogSynflow, to search for model size, yields the capability to fully design high-performance hybrid Transformers for edge applications in less than 1 GPU hour, resulting in the fastest and most accurate NAS method for ImageNet classification.
Comment: 10 pages, 3 figures
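One plausible reading of the Entropic Score idea, aggregating the element-wise entropy of a model's activations as an expressivity proxy, is sketched below with histogram-based entropy. The binning scheme and the layer-wise sum are assumptions and may differ from the paper's exact formulation.

```python
import numpy as np

def entropic_score(activations, bins=32):
    """Sum of Shannon entropies of the activation distributions, one per layer.

    Intuition (assumed here): richer, more varied activations suggest higher
    expressivity; collapsed or constant activations score zero.
    """
    score = 0.0
    for act in activations:                     # one flat array per layer
        hist, _ = np.histogram(act, bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]                            # drop empty bins (0 log 0 = 0)
        score += -(p * np.log(p)).sum()         # Shannon entropy of the layer
    return score

rng = np.random.default_rng(0)
rich = [rng.standard_normal(1000) for _ in range(3)]    # varied activations
collapsed = [np.zeros(1000) for _ in range(3)]          # dead/constant units
print(entropic_score(rich) > entropic_score(collapsed))  # True
```

A score like this needs only a single forward pass on random inputs, which is what makes ranking thousands of candidate topologies feasible within the stated 1 GPU hour budget.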
Incrementality and Hierarchies in the Enrollment of Multiple Synergies for Grasp Planning
Postural hand synergies, or eigenpostures, are joint-angle covariation patterns observed in common grasping tasks. A typical definition associates the geometry of synergy vectors and their hierarchy (relative statistical weight) with the principal component analysis of an experimental covariance matrix. In a reduced-complexity representation, the accuracy of hand posture reconstruction improves incrementally as the number of synergies is increased according to the hierarchy. In this work, we explore whether and how hierarchy and incrementality extend from posture description to grasp force distribution. To do so, we study the problem of optimizing grasps w.r.t. hand/object relative pose and force application, using hand models with an increasing number of synergies ordered according to a widely used postural basis. The optimization is performed numerically on a dataset of simulated grasps of four objects with a 19-DoF anthropomorphic hand. Results show that the hand/object relative poses that minimize (possibly locally) the grasp optimality index remain roughly the same as more synergies are considered. This suggests that an incremental learning algorithm could be conceived, leveraging the solutions of lower-dimensionality problems to progressively address more complex cases as synergies are added. Second, we investigate whether the adopted hierarchy of postural synergies is also the best for force distribution. Results show that this is not the case.
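The incrementality property discussed above, reconstruction accuracy improving as synergies are added in hierarchical order, can be illustrated with a small PCA sketch. The toy 19-DoF posture data and its three-synergy generative model are illustrative assumptions, not the paper's dataset.

```python
import numpy as np

def synergy_basis(postures):
    """PCA of a posture dataset: mean plus components ordered by variance."""
    mu = postures.mean(axis=0)
    _, _, Vt = np.linalg.svd(postures - mu, full_matrices=False)
    return mu, Vt                               # rows of Vt = synergy vectors

def reconstruct(q, mu, Vt, n_syn):
    """Project a posture onto the first n_syn synergies and map it back."""
    B = Vt[:n_syn]
    return mu + B.T @ (B @ (q - mu))

# toy dataset: 19-DoF postures generated from 3 latent synergies + noise
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 3))
W = rng.standard_normal((3, 19))
data = latent @ W + 0.05 * rng.standard_normal((200, 19))

mu, Vt = synergy_basis(data)
q = data[0]
errs = [np.linalg.norm(q - reconstruct(q, mu, Vt, n)) for n in (1, 2, 3)]
print(errs[0] > errs[1] > errs[2])  # error shrinks as synergies are added
```

Because the truncated bases are nested, each added synergy can only reduce the posture reconstruction error; the paper's point is that this clean ordering does not automatically carry over to force distribution.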
Predicting object-mediated gestures from brain activity: an EEG study on gender differences
Recent functional magnetic resonance imaging (fMRI) studies have identified specific neural patterns related to three different categories of movements: intransitive (i.e., meaningful gestures that do not involve objects), transitive (i.e., actions involving an object), and tool-mediated (i.e., actions involving a tool to interact with an object). However, fMRI intrinsically limits the exploitation of these results in a real scenario, such as a brain-machine interface (BMI). In this study, we propose a new approach to automatically predict intransitive, transitive, or tool-mediated movements of the upper limb using electroencephalography (EEG) spectra estimated during a motor planning phase. To this end, high-resolution EEG data gathered from 33 healthy subjects were used as input to a three-class k-Nearest Neighbours classifier. Different combinations of EEG-derived spatial and frequency information were investigated to find the most accurate feature vector. In addition, we studied gender differences by further splitting the dataset into male-only and female-only data. A remarkable difference was found between the accuracies achieved with male and female data, the latter yielding the best performance (78.55% accuracy for the prediction of intransitive, transitive, and tool-mediated actions). These results potentially suggest that different gender-based models should be employed in future BMI applications.